Skip to content

feat(types): migrate db, graph algorithms/builders, and domain/queries to TypeScript (Phase 5.5)#570

Merged
carlos-alm merged 40 commits intomainfrom
feat/ts-migrate-phase5.5
Mar 23, 2026
Merged

feat(types): migrate db, graph algorithms/builders, and domain/queries to TypeScript (Phase 5.5)#570
carlos-alm merged 40 commits intomainfrom
feat/ts-migrate-phase5.5

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Migrate 19 remaining JS files to TypeScript across src/db/, src/graph/, and src/domain/
  • Add typed interfaces for Leiden algorithm internals (GraphAdapter, Partition, DetectClustersResult)
  • Add LockedDatabase type, DependencyGraphOptions, TemporalGraphOptions for graph builders
  • Align vendor.d.ts with BetterSqlite3Database — generic Statement<TRow>, open/name properties
  • Fix pre-existing type errors in module-map.ts and generator.ts

Files migrated

Module Files
src/db/ connection, migrations, query-builder, index barrel
src/graph/algorithms/leiden/ adapter, cpm, modularity, optimiser, partition, index
src/graph/algorithms/ louvain, index barrel
src/graph/builders/ dependency, structure, temporal, index barrel
src/graph/classifiers/ index barrel
src/graph/ index barrel
src/domain/ queries barrel

Test plan

  • tsc --noEmit passes with zero errors
  • 1971/1972 tests pass (1 pre-existing native/WASM parity failure)
  • All 187 graph + community tests pass
  • Biome lint clean (no errors, only pre-existing warnings)

…housekeep

Four recurring maintenance routines as Claude Code skills:
- /deps-audit: vulnerability scanning, staleness, unused deps, license checks
- /bench-check: benchmark regression detection against saved baselines
- /test-health: flaky test detection, dead tests, coverage gap analysis
- /housekeep: clean worktrees, dirt files, sync main, prune branches
…line

- Replace 2>/dev/null with output=$(... 2>&1) + exit_code check on all
  four benchmark invocations so error messages are captured and recorded
- Add division-by-zero guard in Phase 3: when baseline == 0, mark delta
  as "N/A — baseline was zero" (informational only, not a regression)
- Add git add + git commit step in Phase 5 so the baseline file is
  actually committed after each save, matching the documented rule
- After reverting package.json + package-lock.json on --fix test failure,
  also run `npm ci` to resync node_modules/ with the restored lock file;
  without this the manifest is reverted but installed packages are not
- Add explanatory comment on @anthropic-ai/tokenizer skip-list entry
  clarifying it is a peer dependency of @anthropic-ai/sdk and may be
  required at runtime without an explicit import in our code
…erion

- Phase 5 (Update Codegraph): add source-repo guard that skips the
  self-update logic when running inside the codegraph source repo;
  comparing the dev version to the published release and running
  npm install is a no-op since codegraph is not one of its own deps
- Phase 1b stale-worktree criterion: replace "created more than 7 days
  ago" (not determinable via git worktree list) with "last commit on the
  branch is more than 7 days old AND branch has no commits ahead of
  origin/main", using `git log -1 --format=%ci <branch>`
When Phase 0 stash push is a no-op (manifests unchanged), Phase 7
was calling stash drop/pop on the wrong entry. Track STASH_CREATED
exit code and branch on it: use git checkout when no stash exists.
…ent corruption

Replace hardcoded /tmp/test-health-runs/ with mktemp -d so parallel
sessions get isolated directories. Add cleanup at end of analysis.
…plicit commit paths

Add 4th verdict path for --save-baseline when baseline already exists.
Replace corrupted em-dash character in N/A string. Change commit command
to use explicit file paths per project convention.
…ress

Phase 5 was listed as "2 of 7 complete" with outdated pre-Phase 3 file
paths. Updated to reflect actual state: 32 of 269 source modules migrated
(~12%). Steps 5.3-5.5 now list exact migrated/remaining files with verified
counts (5.3=8, 5.4=54, 5.5=175, total=237 JS-only files). Added note about
14 stale .js counterparts of already-migrated .ts files needing deletion.
When STASH_CREATED=1 and tests pass, the npm audit fix changes are
good — no action needed. Previously it ran git checkout to discard
them, which undid the successful fix.
git diff --quiet ignores untracked files, so on the first run when
baseline.json and history.ndjson are newly created, the commit was
skipped. Stage first with git add, then check with --cached.
Branch deletion now asks for user confirmation before each delete,
consistent with worktree removal in Phase 1c.
- bench-check: add timeout 300 wrappers to all 4 benchmark invocations
  with exit code 124 check for timeout detection
- bench-check: add explicit COMPARE_ONLY guard at Phase 5 entry
- housekeep: fix grep portability — use grep -cE instead of GNU \| syntax
- test-health: add timeout 180 wrapper in flaky detection loop
- test-health: fix find command -o precedence with grouping parentheses
When both stat variants (GNU and BSD) fail, $size is empty and the
arithmetic comparison errors out. Add a [ -z "$size" ] && continue
guard so the loop skips files whose size cannot be determined.
…check (#565)

Phase 6: when SAVE_ONLY or first-run (no prior baseline), write a
shortened report with "Verdict: BASELINE SAVED" instead of the full
comparison report.

Phases 1a-1d: replace ambiguous "If timeout / If non-zero" with
explicit "If timeout / Else if non-zero" so the two conditions are
clearly mutually exclusive.
Phase 4 (Resolution Accuracy) had all 6 sub-phases merged but status
still said "In Progress". Phase 5 (TypeScript Migration) had 5.3-5.5
merged via PRs #553, #554, #555, #566 but was listed with stale counts.
Updated both to reflect actual state: Phase 4 complete, Phase 5 at 5/7
with 76 of 283 modules migrated (~27%).
Previous commit incorrectly marked 5.3-5.5 as complete. In reality
76 of 283 src files are .ts (~27%) while 207 remain .js (~73%).
PRs #553, #554, #555, #566 migrated a first wave but left substantial
work in each step: 4 leaf files, 39 core files, 159 orchestration
files. Updated each step with accurate migrated/remaining counts.
The /review skill allowed replying "acknowledged as follow-up" to
reviewer comments without tracking them anywhere. These deferrals
get lost — nobody revisits PR comment threads after merge.

Now: if a fix is genuinely out of scope, the skill must create a
GitHub issue with the follow-up label before replying. The reply
must include the issue link. A matching rule in the Rules section
reinforces the ban.
…s to TypeScript (Phase 5.5)

Migrate 19 remaining JS files to TypeScript across db/, graph/, and domain/:
- db/: connection, migrations, query-builder, index barrel
- graph/algorithms/leiden/: adapter, cpm, modularity, optimiser, partition, index
- graph/algorithms/: louvain, index barrel
- graph/builders/: dependency, structure, temporal, index barrel
- graph/classifiers/: index barrel
- graph/: index barrel
- domain/: queries barrel

Key type additions:
- GraphAdapter, Partition, DetectClustersResult interfaces for Leiden
- LockedDatabase type for advisory-locked DB instances
- DependencyGraphOptions, TemporalGraphOptions for graph builders
- Generic Statement<TRow> in vendor.d.ts for type-safe DB queries

Also fixes pre-existing type errors in module-map.ts (untyped prepare
calls) and generator.ts (null vs undefined argument).
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 23, 2026

Greptile Summary

This PR completes Phase 5.5 of the TypeScript migration by converting 19 JS files across src/db/, src/graph/algorithms/leiden/, src/graph/builders/, and src/domain/ to TypeScript, adding typed interfaces for the Leiden algorithm internals, and aligning vendor.d.ts with the project's internal BetterSqlite3Database abstraction.

All six issues flagged in the previous review round have been correctly addressed:

  • Partition.graph is now graph?: GraphAdapter (optional) with an honest graph: undefined initializer
  • computeQualityGain throws explicitly when partition.graph is unset
  • vendor.d.ts restores the type-preserving transaction<F extends (...args: any[]) => any>(fn: F): F signature
  • module-map.ts COUNT(*) queries use as { c: number } | undefined with ?.c ?? 0
  • g.inEdges truthiness check simplified to if (g.directed)
  • DetectClustersOptions converted to a type alias

One remaining style concern:

  • src/domain/search/generator.ts:73openDb(dbPath) as BetterSqlite3.Database introduces the vendor ambient type (vendor.d.ts) alongside the internal BetterSqlite3Database abstraction (types.ts), making this the only file in the codebase that mixes both. The closeDb call still works at runtime, but changing initEmbeddingsSchema's parameter type to BetterSqlite3Database would eliminate the cast and restore consistency.

Confidence Score: 4/5

  • Safe to merge — all previous critical issues resolved, one minor style inconsistency remains in generator.ts.
  • All six prior review comments have been addressed correctly. The remaining issue (vendor-type cast in generator.ts) is a style concern that does not affect runtime correctness — advisory lock release still happens, and tsc passes. The .js files have been cleanly deleted with no dual-format residue, all typed interfaces are accurate, and the 1971/1972 test pass rate matches the pre-existing native/WASM parity failure unrelated to this PR.
  • src/domain/search/generator.ts — minor type-cast inconsistency on line 73.

Important Files Changed

Filename Overview
src/db/connection.ts Migrated to TypeScript with new LockedDatabase type, proper openDb/openReadonlyOrFail cast patterns, and openRepo helper. Double-cast through unknown to handle better-sqlite3's callable-not-constructor export pattern is safe and necessary. Overall clean migration.
src/vendor.d.ts Restores type-preserving transaction<F extends (...args: any[]) => any>(fn: F): F signature (matches better-sqlite3's own types.ts), adds open and name readonly properties. Previous round's type-safety regression is addressed.
src/graph/algorithms/leiden/partition.ts graph?: GraphAdapter correctly declared as optional. graph: undefined initialization is now honest. Previous undefined as unknown as GraphAdapter type lie is gone. Closure-vs-property duality is clearly documented.
src/graph/algorithms/leiden/optimiser.ts All three previously flagged issues are resolved: partition.graph guard throws explicitly, inEdges truthiness check simplified to if (g.directed), computeQualityGain type narrowing is correct. Code is in good shape.
src/graph/builders/dependency.ts New DependencyGraphOptions interface is well-typed. The minConfidence Repository path correctly filters in JS with the noted performance trade-off. MinConfidenceEdgeRow local interface avoids exposing the narrow SQL result type.
src/domain/analysis/module-map.ts Three COUNT(*) queries now correctly typed as `as { c: number }
src/domain/search/generator.ts The openDb(dbPath) as BetterSqlite3.Database cast at line 73 mixes vendor ambient type with internal BetterSqlite3Database abstraction. closeDb(db) compiles because __lockPath is optional, but the pattern is inconsistent with the rest of the codebase.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["makePartition(graph)"] --> B["Partition object\ngraph?: GraphAdapter = undefined"]
    B --> C["Returned to optimiser"]
    C --> D["partition.graph = graphAdapter\n(set before first use)"]
    D --> E["computeQualityGain(partition, v, c, opts)"]
    E --> F{partition.graph?}
    F -- "falsy" --> G["throw Error:\n'partition.graph must be set\nbefore computeQualityGain'"]
    F -- "set" --> H{opts.quality}
    H -- "cpm" --> I["diffCPM(partition, partition.graph, v, c, γ)"]
    H -- "modularity" --> J["diffModularity(partition, partition.graph, v, c, γ)"]
    I --> K["Return quality gain Δ"]
    J --> K
Loading

Comments Outside Diff (1)

  1. src/domain/search/generator.ts, line 73 (link)

    P2 Vendor type cast creates two parallel DB abstractions in one file

    openDb returns LockedDatabase (which is BetterSqlite3Database & { __lockPath?: string } from types.ts). The as BetterSqlite3.Database cast forces a switch to the vendor's ambient type from vendor.d.ts. The codebase uses BetterSqlite3Database everywhere else as the internal DB abstraction, so this file is now the only place that holds both types simultaneously.

    The cast works at runtime because the underlying object still has __lockPath set (so closeDb(db) on line 168 correctly releases the advisory lock), and TypeScript accepts closeDb(db) because __lockPath is optional on LockedDatabase. But the intent becomes opaque: a reader seeing db: BetterSqlite3.Database would not know that this was opened with openDb and carries an advisory lock that must be released.

    The cleaner fix is to update initEmbeddingsSchema (and any other local functions in this file) to accept BetterSqlite3Database (the internal type from types.ts) instead of BetterSqlite3.Database, eliminating the need for the cast:

    // Change the local helper signature:
    function initEmbeddingsSchema(db: BetterSqlite3Database): void { ... }
    
    // Then remove the cast:
    const db = openDb(dbPath);

    This keeps the entire file using the same DB abstraction as the rest of the codebase and makes the advisory-lock lifecycle visible from the LockedDatabase type.

Reviews (3): Last reviewed commit: "fix: simplify dead interface and truthy ..." | Re-trigger Greptile

getCommunityMembers,
getCommunityTotalSize: getCommunityTotalSizeFn,
getCommunityNodeCount: getCommunityNodeCountFn,
graph: undefined as unknown as GraphAdapter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Type lie for graph initialization

graph: undefined as unknown as GraphAdapter initializes the property to undefined at runtime while telling TypeScript it's a valid GraphAdapter. Since the Partition interface declares graph: GraphAdapter as non-optional, any consumer that reads partition.graph before the optimiser assigns it will get undefined at runtime while TypeScript believes it's always safe to access.

This is particularly fragile because the closure methods (initializeAggregates, accumulateNeighborCommunityEdgeWeights, etc.) use the function-parameter graph from makePartition(g), not this.graph. The data property graph on the returned object is separate and only set later by the optimiser (partition.graph = graphAdapter). The current design has two distinct "graph" concepts that are easy to conflate.

The type lie is technically safe today since the optimiser always sets partition.graph before computeQualityGain reads it — but this contract is not enforced by TypeScript. The interface should reflect the true runtime state:

// Option A: make it optional in the interface (the honest type)
graph?: GraphAdapter;

// Option B: keep as non-optional but remove the initialization lie, 
// passing graph in makePartition itself and removing the external-set pattern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — made Partition.graph optional (graph?: GraphAdapter) and removed the undefined as unknown as GraphAdapter type lie. The returned object now has graph: undefined which honestly reflects the runtime state.

Comment on lines +511 to +515
if (quality === 'cpm') {
return diffCPM(partition, partition.graph || {}, v, c, gamma);
return diffCPM(partition, partition.graph || ({} as GraphAdapter), v, c, gamma);
}
// diffModularity dispatches to diffModularityDirected internally when g.directed is true
return diffModularity(partition, partition.graph || {}, v, c, gamma);
return diffModularity(partition, partition.graph || ({} as GraphAdapter), v, c, gamma);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Ineffective guard for unset partition.graph

partition.graph || ({} as GraphAdapter) appears to be a defensive guard for when partition.graph has not been set yet. However, since the Partition interface declares graph: GraphAdapter as non-optional, TypeScript considers the || branch dead code and will not warn when it is actually reachable.

At runtime (given graph: undefined as unknown as GraphAdapter in partition.ts line 466), if partition.graph is somehow undefined, the fallback ({} as GraphAdapter) would be used — but this empty-object cast also doesn't implement GraphAdapter, so diffCPM/diffModularity would immediately throw trying to access ({}).totalWeight, ({}).n, etc. The guard provides false safety.

The code is safe today because partition.graph is always explicitly set by the optimiser before computeQualityGain is called. But the intent of this guard should be made explicit. Consider either:

// Assert the contract is satisfied
if (!partition.graph) throw new Error('partition.graph must be set before computeQualityGain');
return diffCPM(partition, partition.graph, v, c, gamma);

or simply remove the fallback if the invariant is guaranteed:

return diffCPM(partition, partition.graph, v, c, gamma);
return diffModularity(partition, partition.graph, v, c, gamma);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the partition.graph || ({} as GraphAdapter) fallback with an explicit assertion: if (!partition.graph) throw new Error('partition.graph must be set before computeQualityGain'). The empty-object cast is removed entirely.

src/vendor.d.ts Outdated
exec(sql: string): Database;
transaction<T extends (...args: unknown[]) => unknown>(fn: T): T;
// biome-ignore lint/suspicious/noExplicitAny: must match better-sqlite3's generic Transaction<F>
transaction<T>(fn: (...args: any[]) => T): (...args: any[]) => T;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 transaction<T> type change erases parameter types

The previous signature was:

transaction<T extends (...args: unknown[]) => unknown>(fn: T): T;

This preserved the exact function type, meaning callers got back a function with the same call signature as the one they passed in — argument types included.

The new signature:

transaction<T>(fn: (...args: any[]) => T): (...args: any[]) => T;

Erases all argument types for the returned transaction wrapper. This means callers can now invoke the returned transaction with any arguments without a TypeScript error:

// Was a type error before, now silently accepted:
const myTx = db.transaction((id: number) => { ... });
myTx("not-a-number"); // no TS error with new type

The biome-ignore comment acknowledges this is intentional for better-sqlite3 compatibility, but it's worth noting this is a type-safety regression for all transaction call sites. If the project uses transactions with typed parameters, this could silently mask call-site mistakes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — restored the type-preserving signature transaction<F extends (...args: any[]) => any>(fn: F): F which matches types.ts and preserves argument types for the returned wrapper.

Comment on lines +361 to +371
const totalNodes = (
db.prepare<{ c: number }>('SELECT COUNT(*) as c FROM nodes').get() as { c: number }
).c;
const totalEdges = (
db.prepare<{ c: number }>('SELECT COUNT(*) as c FROM edges').get() as { c: number }
).c;
const totalFiles = (
db.prepare<{ c: number }>("SELECT COUNT(*) as c FROM nodes WHERE kind = 'file'").get() as {
c: number;
}
).c;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unsafe get() cast strips undefined possibility

db.prepare<{ c: number }>(...).get() as { c: number } casts away the | undefined that get() always returns per the SqliteStatement<TRow> type (get(...): TRow | undefined). The chained .c access will throw at runtime if get() returns undefined.

While COUNT(*) in SQLite always returns exactly one row (making this safe in practice), the same pattern appears three times in the moduleMapData function (lines ~362, 364, 366). A more robust approach would preserve the undefined signal:

Suggested change
const totalNodes = (
db.prepare<{ c: number }>('SELECT COUNT(*) as c FROM nodes').get() as { c: number }
).c;
const totalEdges = (
db.prepare<{ c: number }>('SELECT COUNT(*) as c FROM edges').get() as { c: number }
).c;
const totalFiles = (
db.prepare<{ c: number }>("SELECT COUNT(*) as c FROM nodes WHERE kind = 'file'").get() as {
c: number;
}
).c;
const totalNodes =
db.prepare<{ c: number }>('SELECT COUNT(*) as c FROM nodes').get()?.c ?? 0;

This pattern should also be applied to the totalEdges and totalFiles queries below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced the as { c: number } casts with as { c: number } | undefined and optional chaining (?.c ?? 0) for all three COUNT(*) queries in moduleMapData.

# Conflicts:
#	src/domain/analysis/module-map.ts
#	src/vendor.d.ts

Impact: 1 functions changed, 0 affected
Make Partition.graph optional to reflect runtime state (undefined until
optimiser sets it). Replace ineffective `|| ({} as GraphAdapter)` fallback
with an explicit assertion in computeQualityGain.
#570)

Use optional chaining for COUNT(*) queries in moduleMapData to avoid
stripping undefined. Restore type-preserving transaction<F> signature in
vendor.d.ts. Add missing src/graph/builders/index.ts barrel export.
@carlos-alm
Copy link
Contributor Author

@greptileai

queue.push(w);
}
}
if (g.directed && g.inEdges) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant g.inEdges truthiness check — always true per typed interface

The GraphAdapter interface in adapter.ts declares inEdges: InEdgeEntry[][] as a required, non-optional property. Therefore g.inEdges is always a (possibly empty) array and is always truthy. The second condition in:

if (g.directed && g.inEdges) {

is dead code that was carried over from the original JavaScript where the value could legitimately have been undefined.

With the typed interface enforcing that inEdges is always present, this should be simplified to:

Suggested change
if (g.directed && g.inEdges) {
if (g.directed) {

This avoids misleading readers into thinking inEdges can be absent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — simplified to if (g.directed) since g.inEdges is always present per the typed interface.


export type { LeidenOptions } from './optimiser.js';

export interface DetectClustersOptions extends LeidenOptions {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Empty extends interface — prefer a type alias

DetectClustersOptions extends LeidenOptions {} declares an empty interface that adds nothing over its parent. TypeScript/Biome guidelines recommend using a type alias for structural equivalence with no additions:

Suggested change
export interface DetectClustersOptions extends LeidenOptions {}
export type DetectClustersOptions = LeidenOptions;

This avoids the lint warning about empty interfaces and makes the intent clearer (these two types are identical, not just related through inheritance).

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — replaced empty interface with export type DetectClustersOptions = LeidenOptions.

@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 299e99b into main Mar 23, 2026
16 checks passed
@carlos-alm carlos-alm deleted the feat/ts-migrate-phase5.5 branch March 23, 2026 11:23
@github-actions github-actions bot locked and limited conversation to collaborators Mar 23, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant